A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech

نویسنده

  • Alex Acero
چکیده

This paper presents a time-scale pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency domain source-filter model, where the source is modeled as a mixed excitation. This model is highly coupled with a compression scheme that result in compact acoustic inventories. When compared to the approach in the Whistler system using no mixed excitation, the new method shows improvement in voiced fricatives and over-stretched voiced sounds. In addition, it allows for spectral manipulation such as smoothing of discontinuities at unit boundaries, voice transformations or loudness equalization.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis

To preserve shape-invariance when pitch or time-scale modifying sinusoidally modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated excitation points. Previous methods achieve this by estimating excitation phases at synthesis frame boundaries, disregarding the frequency modulation that may occur between the frame boundary...

متن کامل

VLSI implementation of a TSM/FSM algorithm

The time scale modification (TSM) of speech is concerned with the compressing or expanding of audio signals in the time domain without affecting the signals pitch or naturalness. Conversely, the frequency scale modification (FSM) of speech is concerned with altering the pitch and formants of a signal without changing the signal duration. This paper describes a hardware implemented and optimized...

متن کامل

Shape-invariant pitch and time-scale modification of speech by variable order phase interpolation

To preserve the waveform shape and perceived quality of pitch and time-scale modified sinusoidally modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated pitch pulse locations. The glottal excitation is therefore made to resemble a pseudoperiodic impulse train, a quality essential for shape-invariance. Conventional method...

متن کامل

Non-parametric techniques for pitch-scale and time-scale modification of speech

Time-scale and, to a lesser extent, pitch-scale modifications of speech and audio signals are the subject of major theoretical and practical interest. Applications are numerous, including, to name but a few, text-to-speech synthesis (based on acoustical unit concatenation), transformation of voice characteristics, foreign language learning but also audio monitoring or film/soundtrack post-synch...

متن کامل

Voice source analysis for pitch-scale modification of speech signals

Much research has shown that the voice source has strong influence on the quality of speech processing [4][5][6]. But in most of the existing speech modification algorithms, the effect of the voice source variation is neglected. This work explains why the existing modification scheme can’t truly reflect the voice source variation during pitch modification. We use synthesized voiced speech sound...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998